Stereo image processing using contours

ABSTRACT

A computer-implemented stereo image processing method which uses contours is described. In an embodiment, contours are extracted from two silhouette images captured at substantially the same time by a stereo camera of at least part of an object in a scene. Stereo correspondences between contour points on corresponding scanlines in the two contour images (one corresponding to each silhouette image in the stereo pair) are calculated on the basis of contour point comparison metrics, such as the compatibility of the normal of the contours and/or a distance along the scanline between the point and a centroid of the contour. A corresponding system is also described.

RELATED APPLICATIONS

This application claims priority to U.S. application Ser. No. 14/154,825, filed on Jan. 14, 2014, and entitled “STEREO IMAGE PROCESSING USING CONTOURS.” This application claims the benefit of the above-identified application, and the disclosure of the above-identified application is hereby incorporated by reference in its entirety as if set forth herein in full.

BACKGROUND

Natural user interfaces (NUI) have captured the imagination of many, as they shift the paradigm of human-computer interaction away from the traditional mouse and keyboard, towards more expressive input modalities. Whilst the term is broad and can encompass touch, gesture, gaze, voice, and tangible input, NUI often implies leveraging the dexterity that the higher degrees-of-freedom (DoF) of our hands allow for interaction.

With the advent of consumer depth cameras, many new systems for in-air interactions coupled with surface-based interactions have appeared. Depth cameras estimate depth by projecting dynamic patterns onto a scene and capturing images of the projected patterns with a stereo camera. Using pattern recognition, the camera system is able to estimate depth based on discrepancies between the positions of recognized patterns in each pair of left and right input images captured by the stereo camera.

To obtain such discrepancies, the camera system uses stereo image processing or stereo-matching algorithms to identify corresponding points in each pair of input images (left and right) captured by the stereo camera, where the points in the two input images are projections from the same scene point.

Stereo matching algorithms, along with the subsequent computation of depth, may incur significant computational cost. Furthermore, to deal with movement, patterns need to be projected and imaged at high frame rates, which involves expensive hardware.

Researchers continue to look for new ways of reducing computational and procurement costs, whilst retaining or increasing precision.

The examples described below are not limited to implementations which solve any or all of the disadvantages of known natural user interface (NUI) technologies.

SUMMARY

The following presents a simplified summary of the disclosure in order to provide a basic understanding to the reader. This summary is not an extensive overview of the disclosure and it does not identify key/critical elements or delineate the scope of the specification. Its sole purpose is to present a selection of concepts disclosed herein in a simplified form as a prelude to the more detailed description that is presented later.

A computer-implemented stereo image processing method which uses contours is described. In an embodiment, contours are extracted from two silhouette images captured at substantially the same time by a stereo camera of at least part of an object in a scene. Stereo correspondences between contour points on corresponding scanlines in the two contour images (one corresponding to each silhouette image in the stereo pair) are calculated on the basis of contour point comparison metrics, such as the compatibility of the normal of the contours and/or a distance along the scanline between the point and a centroid of the contour. A corresponding system is also described.

Many of the attendant features will be more readily appreciated as the same becomes better understood by reference to the following detailed description considered in connection with the accompanying drawings.

DESCRIPTION OF THE DRAWINGS

The present description will be better understood from the following detailed description read in light of the accompanying drawings, wherein:

FIG. 1 illustrates a scenario involving an exemplary computing-based device in which examples of a stereo image processing system and method may be implemented;

FIG. 2 is a schematic diagram illustrating the computing-based device of FIG. 1 comprising an example of a stereo image processing system;

FIG. 3 is a schematic diagram illustrating part of the example of the stereo image processing system of FIG. 2 in greater detail;

FIG. 4 illustrates left and right images captured by a stereo camera being processed according to an example of a stereo image processing method;

FIG. 5 illustrates further aspects of the example of the stereo image processing method of FIG. 4;

FIG. 6 is a flow diagram of the example of the stereo image processing method;

FIG. 7 illustrates an input image being processed by a contour extraction module to produce a contour image.

Like reference numerals are used to designate like parts in the accompanying drawings.

DETAILED DESCRIPTION

The detailed description provided below in connection with the appended drawings is intended as a description of the present examples and is not intended to represent the only forms in which the present example may be constructed or utilized. The description sets forth the functions of the example and the sequence of steps for constructing and operating the example. However, the same or equivalent functions and sequences may be accomplished by different examples.

In an exemplary NUI setup involving stereo image processing, a stereo camera and illumination source are placed in relation to a retro-reflective surface so as to define an interaction volume, in which a user may provide input using hand gestures. When triggered, the stereo camera is shuttered and the illumination source pulsed simultaneously during the camera exposure. The illumination causes a bright uniform response from the retro-reflective surface, such that the surface is readily distinguishable as a bright silhouette. When objects interact on or above these surfaces, a sharp contrast is created between the surface and the object. A contour may be extracted from the silhouette, and stereo matching performed in the manner described herein. The stereo matching algorithm estimates the 3D depth of the contour of the object.

Although the present examples are described and illustrated herein as being implemented in a computing-based device, the system described is provided as an example and not a limitation. As those skilled in the art will appreciate, the present examples are suitable for application in a variety of different types of computing-based systems.

FIG. 1 illustrates a scenario involving an exemplary computing-based device 200 which may be implemented as any form of a computing and/or electronic device, and in which examples of a stereo image processing system and method may be implemented.

The computing-based device 200 is in communication with a stereo camera 100 with an illumination source (not shown) and a display device 104. A retro-reflective surface 102 lies in front of the computing-based device 200, which in conjunction with the stereo camera 100 defines a volume of space in which the user is able to provide input to the computing-based device 200 using hand gestures. The retro-reflective surface 102 and stereo camera 100 may be positioned in any such relationship that an interaction volume is defined whilst the retro-reflective surface reflects illumination back to the stereo camera 100 so as to produce a high-contrast or silhouette image.

As shown in FIG. 2, the computing-based device 200 comprises one or more processors 202 which may be microprocessors, controllers or any other suitable type of processors for processing computer-executable instructions. In some examples, for example where a system on a chip architecture is used, the processors 202 may include one or more fixed function blocks (also referred to as accelerators) which implement a part of a stereo image processing method in hardware (rather than software or firmware). Platform software comprising an operating system 204 or any other suitable platform software may be provided at the computing-based device 200 to enable application software 206 and a stereo image processing module 208 to be executed.

The computer executable instructions may be provided using any computer-readable media that is accessible by computing based device 200. Computer-readable media may include, for example, computer storage media such as memory 216 and communications media. Computer storage media, such as memory 216, includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. Computer storage media includes, but is not limited to, RAM, ROM, EPROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information for access by a computing device. In contrast, communication media may embody computer readable instructions, data structures, program modules, or other data in a modulated data signal, such as a carrier wave, or other transport mechanism. As defined herein, computer storage media does not include communication media. Therefore, a computer storage medium should not be interpreted to be a propagating signal per se. Propagated signals may be present in a computer storage media, but propagated signals per se are not examples of computer storage media. Although the computer storage media (memory 216) is shown within the computing-based device 200 it will be appreciated that the storage may be distributed or located remotely and accessed via a network or other communication link (e.g. using communication interface 218).

The computing-based device 200 also comprises an input/output controller 220 arranged to output display information to the display device 104 which may be separate from or integral to the computing-based device 200. The display information may provide a graphical user interface. The input/output controller 220 is also arranged to receive and process input from one or more devices, such as stereo camera 100, and in various examples also from e.g. a mouse, keyboard, camera, microphone or other sensor (not shown). In some examples the user input devices may detect voice input, user gestures or other user actions and may provide a natural user interface (NUI). In an example the display device 104 may also act as a user input device if it is a touch sensitive display device. The input/output controller 220 may also output data to devices other than the display device 104, e.g. a locally connected printing device (not shown in FIG. 2).

Any of the input/output controller 220, display device 104 and the stereo camera 100 or other input devices may comprise NUI technology including but not limited to those relying on voice and/or speech recognition, touch and/or stylus recognition (touch sensitive displays), head and eye tracking, voice and speech, vision, touch, machine intelligence, intention and goal understanding systems, motion gesture detection using accelerometers/gyroscopes, facial recognition, 3D displays, head, eye and gaze tracking, immersive augmented reality and virtual reality systems and technologies for sensing brain activity using electric field sensing electrodes (EEG and related methods).

FIG. 3 is a schematic diagram illustrating the stereo image processing module 208 of FIG. 2 in greater detail. This module may be implemented in software, hardware or any combination thereof. The stereo image processing module 208 comprises a contour extraction module 302, a stereo computation module 304, a depth computation module 306, and a post-processing module 308.

The inventors have recognized that, by restricting the application of stereo image processing to sparse contours, the computational cost can be greatly reduced, whilst increasing precision of disparity estimation.

The contour extraction module 302 performs a contour extraction process on first and second input images, captured by the stereo camera 100 at substantially the same time, of at least part of an object in a scene, to produce respective first and second pluralities of contour points, each defining a contour K, K′ of the at least part of the object. The term “contour” is used herein to refer to a number of connected components collectively defining an outer surface of an object.

In one example, as illustrated in FIG. 7, the contour extraction module 302 applies a threshold to an input image 702 to produce a silhouette image 704, which may be a high-contrast or binary image. The contour extraction module 302 identifies transitions between contrasting regions in the silhouette image 704 (e.g. using thresholding) to produce the first and second pluralities of contour points, which may be represented by a contour image 706. The object being imaged in this case comprises a user's hands 708. The first and second pluralities of contour points may be stored as lists of contour points, for example sorted (or ordered) lists for rapid searching. In one example, the input image 702, particularly when produced by the stereo camera 100 in conjunction with the illumination source and retro-reflective surface 102, may be of sufficiently high contrast that it serves as the silhouette image 704, so that no thresholding is performed.
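By way of illustration only, the thresholding and transition-finding steps described above may be sketched in Python as follows. The function names, the threshold value of 128, and the convention that the object appears darker than the retro-reflective background are assumptions made for this sketch and are not taken from the description. Only horizontal transitions within each row are considered.

```python
def silhouette(image, threshold=128):
    """Binarize a grayscale image: 1 = object (dark), 0 = bright background."""
    return [[1 if px < threshold else 0 for px in row] for row in image]


def contour_points_per_scanline(sil):
    """For every row, list the x-coordinates of object pixels that border
    background along that row. Each list comes out sorted by x."""
    points = []
    for row in sil:
        padded = [0] + row + [0]  # treat pixels outside the image as background
        xs = []
        for x in range(len(row)):
            cur, left, right = padded[x + 1], padded[x], padded[x + 2]
            if cur == 1 and (left == 0 or right == 0):
                xs.append(x)
        points.append(xs)
    return points


image = [
    [200, 200, 50, 50, 50, 200],    # object spans x = 2..4 on this scanline
    [200, 200, 200, 200, 200, 200],  # background only
]
contours = contour_points_per_scanline(silhouette(image))
```

In this sketch the first scanline yields the two boundary columns of the object and the second scanline yields an empty list.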

Performing the contour extraction process may comprise computing a convex hull of at least the object in the image to extract the contour points defining the contour K, K′. The output of the contour extraction module 302 may comprise a 1D closed contour line (e.g. a 1D vector of 2D image points), or contour image, for each of the pair of input images, which will ultimately provide a depth for every contour point.

The stereo computation module 304 is configured to calculate stereo correspondences between contour points on corresponding scanlines in the first and second contour images on the basis of contour point comparison metrics.

The stereo computation module 304 uses a stereo algorithm to identify corresponding points in pairs of contour images that are projections from the same scene point. The contour images may be rectified, such that corresponding points are known to lie on the same horizontal scanline in the left and right contour images, which reduces the depth estimation to a 1D search task. The horizontal displacement between corresponding points is known as disparity and is inversely proportional to depth. The stereo computation module 304 determines the correspondences between contour points that lie on a particular scanline S.

FIG. 4 illustrates rectified left and right contour images output by the contour extraction module 302 being processed by the stereo computation module 304. The left (first) contour image 402 shows a contour K of an object having a centroid “Centroid(K)”, and the right (second) contour image 404 shows a contour K′ of the same object again having a centroid “Centroid(K′)”. The two contour images 402, 404 are scanned line by line or in parallel, with one scanline S being illustrated in FIG. 4.

To calculate stereo correspondences, the stereo computation module 304, for each of a number of corresponding scanlines S in the first and second contour images 402, 404, identifies a first set of contour points P in the scanline S in the first contour image 402 and a second set of contour points Q in the scanline S in the second contour image 404. In the example shown in FIG. 4 there are two contour points P in the scanline S in the first contour image 402 and two contour points Q in the scanline S in the second contour image 404, and two lists may be stored for each scanline, with each list comprising those contour points in the scanline. In the illustrated example, for any one of the contour points in the first list (for scanline S) there are only two candidate contour points, which are the two points identified in the second list (for scanline S). In other examples, there may be more than two candidate contour points.

More specifically, where p and q are contour points in the left and right images 402, 404, respectively, P and Q are lists of length |P| and |Q| that store the image x-coordinate of those contour points lying on the scanline S in the left and right images 402, 404, respectively. P(i) denotes the ith element in the list P, Q(j) denotes the jth element in the list Q, and the lists P and Q may be sorted (or ordered).
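As a non-limiting sketch of the per-scanline lists P and Q, the following Python groups (x, y) contour points by scanline and keeps the x-coordinates of each scanline in sorted order. The function name `scanline_lists` and the representation of a contour as a list of (x, y) tuples are assumptions of this sketch.

```python
from collections import defaultdict


def scanline_lists(contour_points):
    """Group (x, y) contour points by scanline y.

    Returns a dict mapping each scanline index to the sorted list of
    x-coordinates of the contour points lying on that scanline.
    """
    rows = defaultdict(list)
    for x, y in contour_points:
        rows[y].append(x)
    return {y: sorted(xs) for y, xs in rows.items()}


left_contour = [(7, 3), (2, 3), (5, 4)]
lists_left = scanline_lists(left_contour)
P = lists_left[3]  # list P for scanline S = 3 in the left image
```

The same function may be applied to the right contour image to obtain the list Q for the corresponding scanline.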

In order to identify the corresponding contour points in the left and right images, the stereo computation module 304 obtains one or more comparison metrics for each contour point in the lists P and Q. In the example shown, the comparison metrics comprise a measure of direction of curvature of the contours K, K′, that measure being in this example a unit normal vector $\vec{n}(i)$, $\vec{n}(i+1)$, $\vec{n}(j)$, $\vec{n}(j+1)$ representing each contour point. Optionally, the one or more comparison metrics may also comprise a centroid separation, comprising a distance, for example distC(P(i)), distC(Q(j)) as shown in FIG. 4, along the scanline S between each contour point and a centroid of the respective contour K, K′ on which the contour point lies. Any other metric may be used instead or in addition, alone or in combination, so long as the metric may be used to identify corresponding points in the left and right images.

The stereo computation module 304 compares the one or more comparison metrics of each of the first set of contour points P in the first contour image 402 with the one or more comparison metrics of each of the second set of contour points Q in the second contour image 404 (the candidate points) to identify a best match from all the candidate points. This may be performed by producing a cost matrix 500, as illustrated in FIG. 5. Comparing comparison metrics may comprise calculating, for each pair of contour points in the cost matrix 500, a magnitude of the resultant vector obtained by subtracting the unit normal vector for one of the pair of contour points from the unit normal vector for the other of the pair of contour points. Comparing comparison metrics may further comprise calculating, for each pair of contour points in the cost matrix, a magnitude of the difference between the centroid separation for one of the pair of contour points and the centroid separation for the other of the pair of contour points. The stereo computation module 304 then computes a minimum-cost path 504 through the cost matrix 500 to calculate the stereo correspondences between the first set of contour points and the second set of contour points.

As shown in FIG. 5, each cell 502 indicates the accumulated costs at C(i, j). The minimum-cost path 504 is indicated, with matching pixels being marked by a circle, and those pixels along the path 504 which are not marked by a circle being occluded in the left or right image.

In more detail, the accumulated cost of a path is defined recursively (for i>0 and j>0) as:

$C(i,j) = \min \left\{ \begin{matrix} C(i-1,j-1) + C_{match}(i,j) + C_{smooth}(i,j) \\ C(i-1,j) + \lambda_{occlusion} \\ C(i,j-1) + \lambda_{occlusion} \end{matrix} \right. \qquad (1)$

where the three different cases correspond to the three permitted moves through the cost matrix (a match move and the two occlusion moves). The boundary conditions are:

-   C(0,0) = C_(match)(0,0)
-   C(i,0) = i·λ_(occlusion), i>0
-   C(0,j) = j·λ_(occlusion), j>0

In Equation 1, λ_(occlusion) is a constant occlusion penalty and the remaining terms are defined as follows.

The data term C_(match) measures the compatibility of putatively matching contour points:

$C_{match}(i,j) = \left\| \vec{n}(i) - \vec{n}(j) \right\| + \left| \mathrm{distC}(P(i)) - \mathrm{distC}(Q(j)) \right| \qquad (2)$

Here, the first part measures the Euclidean distance of the normal vectors $\vec{n}(i)$ and $\vec{n}(j)$ at contour points indexed by P(i) and Q(j) respectively. The second part compares the horizontal distance of point P(i) and Q(j) to the centroid of their corresponding contours K and K′, respectively: distC(P(i))=P(i)−Centroid(K).
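The data term of Equation 2 may, for example, be evaluated as in the following Python sketch, which reads the first part as the Euclidean distance between the unit normal at P(i) and the unit normal at Q(j). The function names `dist_c` and `c_match`, the representation of normals as (nx, ny) tuples, and the passing of centroid x-coordinates as plain numbers are assumptions of this sketch.

```python
import math


def dist_c(x, centroid_x):
    """Signed horizontal distance from a contour point to its contour's centroid."""
    return x - centroid_x


def c_match(n_p, n_q, x_p, x_q, centroid_k, centroid_k_prime):
    """Data term: normal-vector distance plus centroid-separation difference."""
    normal_term = math.hypot(n_p[0] - n_q[0], n_p[1] - n_q[1])
    centroid_term = abs(dist_c(x_p, centroid_k) - dist_c(x_q, centroid_k_prime))
    return normal_term + centroid_term


# Same normal, same offset from the centroid: cost is zero.
same = c_match((1.0, 0.0), (1.0, 0.0), 10, 6, 12, 8)
# Opposite normals (e.g. left edge versus right edge of the object): cost of 2.
opposite = c_match((1.0, 0.0), (-1.0, 0.0), 10, 6, 12, 8)
```

Because the normals are unit vectors, the first part is bounded by 2, whereas the second part is measured in pixels; any relative weighting of the two parts would be a further design choice not specified here.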

Producing the cost matrix 500 may comprise using a smoothing term to adjust the cost for each pair of contour points (P, Q) in the cost matrix, the smoothing term being the difference between the separation of one of the contour points in the pair from its closest neighboring contour point and the separation of the other of the contour points in the pair from its closest neighboring contour point. In the example above, the pairwise term C_(smooth) encourages solutions where the horizontal distance between two neighboring matching points P(i) and P(φ(i)) is similar to the distance of their matching points Q(j) and Q(φ(j)):

$C_{smooth}(i,j) = \left\| P(i) - P(\varphi(i)) - \left( Q(j) - Q(\varphi(j)) \right) \right\| \qquad (3)$

where function φ( ) returns the closest previous matching point for the current path.
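Since P and Q store scalar x-coordinates, the norm in Equation 3 reduces to an absolute value. A minimal Python sketch follows, in which the function name `smooth_term` is hypothetical and the previous matching points P(φ(i)) and Q(φ(j)) are assumed to have already been looked up by the caller.

```python
def smooth_term(p_i, p_prev, q_j, q_prev):
    """|(P(i) - P(phi(i))) - (Q(j) - Q(phi(j)))| for scalar x-coordinates."""
    return abs((p_i - p_prev) - (q_j - q_prev))


# Left spacing 30 - 10 = 20, right spacing 25 - 6 = 19: penalty of 1.
penalty = smooth_term(30, 10, 25, 6)
```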

To find correspondences between the points in P and Q, the minimum-cost path through the cost matrix 500 is computed, as illustrated in FIG. 5. The cost matrix 500 is of size |P|×|Q| and each cell 502 stores C(i, j), the minimum accumulated cost of an optimal path from (0, 0) to (i, j). The minimum-cost path 504 is indicated in FIG. 5. The path may start at cell (0, 0) and end at cell (|P|, |Q|), so that a mapping for all contour points may be used. Three exemplary moves may be used to construct a path: a diagonal 45° move that indicates a match, as well as horizontal and vertical moves that represent points that are only visible in the left or right image, respectively.

The restrictions imposed on the path may involve the properties of uniqueness, since every contour point in P can only match to one point in Q and vice versa, and ordering, since if point P(i) matches Q(j) then P(i+1) can only match to Q(j+Δ) where Δ>0.

A box constraint may be imposed, to improve robustness. The box constraint may require that a dimension of a first bounding box of a first contour K in the first contour image 402 must not exceed by more than a predetermined amount a corresponding dimension of a second bounding box of a second contour K′ in the second contour image 404. More particularly, where K is the contour that P(i) belongs to and K′ is the contour that Q(j) belongs to, then if P(i) matches Q(j) the height of the bounding boxes of K and K′ must not differ by more than a predetermined number of pixels, in one example 50 pixels.
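One possible way of checking the box constraint on bounding-box height is sketched below in Python. The function names, the representation of a contour as a list of (x, y) tuples, and the inclusive pixel-count definition of height are assumptions of this sketch; the default of 50 pixels is the example value given above.

```python
def bounding_box_height(contour):
    """Height in pixels of the axis-aligned bounding box of (x, y) points."""
    ys = [y for _, y in contour]
    return max(ys) - min(ys) + 1


def box_constraint_ok(contour_k, contour_k_prime, max_diff=50):
    """True if the two bounding-box heights differ by at most max_diff pixels."""
    diff = abs(bounding_box_height(contour_k) - bounding_box_height(contour_k_prime))
    return diff <= max_diff


K = [(0, 0), (5, 99)]          # height 100
K_prime = [(0, 10), (3, 69)]   # height 60
small = [(0, 0), (1, 9)]       # height 10
```

A candidate match between P(i) on K and Q(j) on K′ could then be given an infinite matching cost whenever `box_constraint_ok` returns False.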

Thus, one example uses dynamic programming to find the path of minimal cost through the matrix 500. The optimization may consist of two consecutive steps. First, the cumulative costs C(i, j) are computed recursively for every pair (i, j) as per Equation 1. At each recursion the best “move” (i.e. the arg min of Eq. 1) is stored at (i, j) in a matrix M that has the same dimensions as the matrix 500. In the second step, the best path is reconstructed by tracing back the best moves stored in M starting from (|P|, |Q|) until the origin (0, 0) of M is reached. Once the best path is computed, the disparity d(i) (with respect to the left image) can be derived as d(i)=P(i)−Q(j) for matching points P(i) and Q(j).
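The two-step optimization may be sketched in Python as follows. This is a simplified illustration under stated assumptions rather than a definitive implementation: the smoothing term and the box constraint are omitted, the function name `match_scanline` and the move labels are hypothetical, and the table is padded to (|P|+1)×(|Q|+1) cells with a zero-cost origin, which is a common dynamic-programming convention and differs slightly from the boundary conditions stated above.

```python
def match_scanline(P, Q, match_cost, occlusion=1.0):
    """Return (matches, disparities) for one scanline.

    P, Q: sorted x-coordinates of contour points in the left/right image.
    match_cost(i, j): cost of matching P[i] with Q[j] (e.g. the data term).
    """
    n, m = len(P), len(Q)
    inf = float("inf")
    # C[i][j]: minimum cost of explaining the first i points of P
    # and the first j points of Q.
    C = [[inf] * (m + 1) for _ in range(n + 1)]
    M = [[None] * (m + 1) for _ in range(n + 1)]  # best move into each cell
    C[0][0] = 0.0
    for i in range(1, n + 1):
        C[i][0], M[i][0] = i * occlusion, "skip_p"
    for j in range(1, m + 1):
        C[0][j], M[0][j] = j * occlusion, "skip_q"
    # Step 1: fill the cumulative costs and remember the arg min move.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            options = [
                (C[i - 1][j - 1] + match_cost(i - 1, j - 1), "match"),
                (C[i - 1][j] + occlusion, "skip_p"),  # P[i-1] left unmatched
                (C[i][j - 1] + occlusion, "skip_q"),  # Q[j-1] left unmatched
            ]
            C[i][j], M[i][j] = min(options, key=lambda o: o[0])
    # Step 2: trace the stored moves back from (|P|, |Q|) to the origin.
    matches = []
    i, j = n, m
    while i > 0 or j > 0:
        move = M[i][j]
        if move == "match":
            matches.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif move == "skip_p":
            i -= 1
        else:
            j -= 1
    matches.reverse()
    disparities = {i: P[i] - Q[j] for i, j in matches}
    return matches, disparities


P = [10, 30]
Q = [6, 25]
# Illustrative cost: difference in centroid separation only (centroids 20 and 15.5).
cost = lambda i, j: abs((P[i] - 20) - (Q[j] - 15.5))
matches, disparities = match_scanline(P, Q, cost, occlusion=5.0)
```

Because each match consumes one point from each list and the path only moves forward, the uniqueness and ordering restrictions are satisfied by construction.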

The depth computation module 306 computes depth using the calculated disparities between the first plurality of contour points and the second plurality of contour points to obtain a depth map. The relationship between disparity and depth is determined based on the known relative positions of the cameras which captured each of the two images.
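The description does not fix a particular formula for this relationship. As an illustration, the following Python sketch assumes the textbook relation for a rectified pinhole stereo pair, in which depth equals focal length times baseline divided by disparity. The function name, the parameter names, and the example values of 600 pixels and 60 mm are hypothetical.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_mm):
    """Depth in mm under a rectified pinhole stereo model, or None if undefined."""
    if disparity_px <= 0:
        return None  # zero or negative disparity gives no finite depth
    return focal_length_px * baseline_mm / disparity_px


near = depth_from_disparity(5, focal_length_px=600, baseline_mm=60)
far = depth_from_disparity(4, focal_length_px=600, baseline_mm=60)
```

A larger disparity yields a smaller depth, consistent with disparity being inversely proportional to depth.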

The post-processing module 308 filters out wrong matches and assigns a depth value to all points along the contour. Although the estimated disparity map is usually of high quality, there might be outliers and contour points where no depth could be estimated (e.g. due to occlusion of this point in the other image).

The post-processing module 308 may invalidate contour points having a normal vector defining an angle with respect to the scanline S that is more than a predetermined amount. The depth estimation can be unreliable at contour points whose normal is almost perpendicular to the scanlines, as indicated by the regions 406 marked in FIG. 4. This is because the term C_(match) is ambiguous in those regions. In one example, contour points whose normal vector has an angle of ≥85° are invalidated.
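A minimal Python sketch of this angle test is given below, assuming that scanlines are horizontal and that a normal is given as an (nx, ny) tuple. The function names are hypothetical; the 85° default is the example value given above.

```python
import math


def normal_angle_deg(normal):
    """Angle in [0, 90] degrees between a normal vector and the horizontal scanline."""
    nx, ny = normal
    return math.degrees(math.atan2(abs(ny), abs(nx)))


def is_reliable(normal, max_angle_deg=85.0):
    """False for contour points whose normal is at or beyond max_angle_deg."""
    return normal_angle_deg(normal) < max_angle_deg


horizontal_normal = is_reliable((1.0, 0.0))  # normal along the scanline
vertical_normal = is_reliable((0.0, 1.0))    # normal perpendicular to the scanline
```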

The post-processing module 308 may invalidate contour points which lie in separate, adjacent scanlines S along a contour in the depth map and whose depth differs by more than a predetermined amount. The depth map obtained with dynamic programming may be computed independently for each scanline. As a consequence, the depth at neighboring contour points that lie across scanlines might be inconsistent. Thus, outlier points whose depth differs from those of the closest neighboring points along the contour, for example by more than 3 mm, may be invalidated.
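One way of realizing this outlier test is sketched below in Python. The sketch assumes that depth values are stored in order along a closed contour, that None marks a point with no depth, and that a point is an outlier only when it disagrees with every valid immediate neighbor; these choices and the function name are assumptions of the sketch.

```python
def invalidate_outliers(depths, max_diff_mm=3.0):
    """Return a copy of depths with outliers replaced by None.

    depths: depth values (mm) ordered along a closed contour; None = no depth.
    A point is invalidated when its depth differs from all of its valid
    immediate neighbors along the contour by more than max_diff_mm.
    """
    n = len(depths)
    out = list(depths)
    for k, d in enumerate(depths):
        if d is None or n < 3:
            continue
        neighbors = [depths[(k - 1) % n], depths[(k + 1) % n]]
        valid = [v for v in neighbors if v is not None]
        if valid and all(abs(d - v) > max_diff_mm for v in valid):
            out[k] = None
    return out


cleaned = invalidate_outliers([500, 501, 520, 502, 503])
```

The comparison uses the original depth values throughout, so the result does not depend on the order in which points are visited.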

The post-processing module 308 may smooth depth values along a contour in the depth map, for example by smoothing the depth values along the contour with a 1D mean filter of size 25 pixels. The filter can be implemented using a sliding window technique with two operations per contour pixel.
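The sliding window technique may be sketched in Python as follows: a running sum is maintained, and for each contour pixel one value is added and one is subtracted. The sketch assumes a closed contour (so the window wraps around), an odd filter size, and depth values that are all valid; the function name is hypothetical.

```python
def mean_filter_closed(values, size=25):
    """Circular 1D mean filter over values ordered along a closed contour."""
    n = len(values)
    if n == 0:
        return []
    half = size // 2
    width = 2 * half + 1
    # Initial window centered on index 0 (wraps around the contour).
    window_sum = sum(values[k % n] for k in range(-half, half + 1))
    out = []
    for k in range(n):
        out.append(window_sum / width)
        # Slide by one: add the entering value, subtract the leaving one.
        window_sum += values[(k + half + 1) % n] - values[(k - half) % n]
    return out


smoothed = mean_filter_closed([0, 3, 6, 9], size=3)
```

The cost per contour pixel is independent of the filter size, which is the benefit of the sliding window over recomputing each window sum from scratch.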

Finally, the post-processing module 308 may assign depth values to occluded contour points, invalidated contour points, or both. The depth value may be computed as a linear combination of the depth assigned to the closest valid contour points.
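The particular linear combination is not specified above. As one possibility, the following Python sketch uses distance-weighted linear interpolation between the closest valid point in each direction along a closed contour. The weighting scheme, the use of None to mark missing depths, and the function name are assumptions of this sketch.

```python
def fill_missing_depths(depths):
    """Fill None entries by interpolating along a closed contour.

    Each missing value becomes a weighted average of the closest valid depth
    behind it and the closest valid depth ahead of it, with the nearer point
    receiving the larger weight.
    """
    n = len(depths)
    if all(d is None for d in depths):
        return list(depths)  # nothing to interpolate from
    out = list(depths)
    for k in range(n):
        if depths[k] is not None:
            continue
        # Steps to the closest valid point in each direction (with wrap-around).
        back = next(s for s in range(1, n + 1) if depths[(k - s) % n] is not None)
        fwd = next(s for s in range(1, n + 1) if depths[(k + s) % n] is not None)
        d_back, d_fwd = depths[(k - back) % n], depths[(k + fwd) % n]
        out[k] = (fwd * d_back + back * d_fwd) / (back + fwd)
    return out


filled = fill_missing_depths([100, None, None, 130])
```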

In general, the complexity of a pixelwise stereo matching algorithm can be characterized with O(WHD), where W and H are the width and height of the images, respectively, and D is the number of tested depth hypotheses. In many scenarios D=O(W), i.e. higher resolution images imply a larger set of depth hypotheses to be considered.

Restricting the computation of depth to pixels lying on the silhouette contours may result in a substantial reduction in the search range D, since only a small set of extracted contour pixels in both images can be potential matches.

FIG. 6 is a flow diagram of the example of the stereo image processing method, including contour extraction 602, stereo computation 604, depth computation 606, and post-processing, as described above.

The system and method as described may provide a novel stereo-matching algorithm that efficiently extracts 3D points along object contours using dynamic programming.

The method described herein may be preceded by one or more of: 1) intrinsic calibration to compute the geometric parameters of each (IR) camera lens (focal length, principal point, radial and tangential distortion); 2) stereo calibration to compute the geometric relationship between the two camera lenses, expressed as a rotation matrix and translation vector; 3) stereo rectification to correct the camera image planes to ensure they are scanline-aligned to simplify disparity computation. At runtime, the synchronized input IR images are captured from both cameras simultaneously. Each image may be undistorted given intrinsic lens parameters, rectified to ensure that stereo matching can occur directly across scanlines of left and right images, and finally cropped to ignore non-overlapping parts.

Alternatively, or in addition, the functionality described herein can be performed, at least in part, by one or more hardware logic components. For example, and without limitation, illustrative types of hardware logic components that can be used include Field-programmable Gate Arrays (FPGAs), Application-specific Integrated Circuits (ASICs), Application-specific Standard Products (ASSPs), System-on-a-chip systems (SOCs), and Complex Programmable Logic Devices (CPLDs).

The term ‘computer’ or ‘computing-based device’ is used herein to refer to any device with processing capability such that it can execute instructions. Those skilled in the art will realize that such processing capabilities are incorporated into many different devices and therefore the terms ‘computer’ and ‘computing-based device’ each include PCs, servers, mobile telephones (including smart phones), tablet computers, set-top boxes, media players, games consoles, personal digital assistants and many other devices.

The methods described herein may be performed by software in machine readable form on a tangible storage medium e.g. in the form of a computer program comprising computer program code means adapted to perform all the steps of any of the methods described herein when the program is run on a computer and where the computer program may be embodied on a computer readable medium. Examples of tangible storage media include computer storage devices comprising computer-readable media such as disks, thumb drives, memory etc. and do not include propagated signals. Propagated signals may be present in a tangible storage media, but propagated signals per se are not examples of tangible storage media. The software can be suitable for execution on a parallel processor or a serial processor such that the method steps may be carried out in any suitable order, or simultaneously.

This acknowledges that software can be a valuable, separately tradable commodity. It is intended to encompass software, which runs on or controls “dumb” or standard hardware, to carry out the desired functions. It is also intended to encompass software which “describes” or defines the configuration of hardware, such as HDL (hardware description language) software, as is used for designing silicon chips, or for configuring universal programmable chips, to carry out desired functions.

Those skilled in the art will realize that storage devices utilized to store program instructions can be distributed across a network. For example, a remote computer may store an example of the process described as software. A local or terminal computer may access the remote computer and download a part or all of the software to run the program. Alternatively, the local computer may download pieces of the software as needed, or execute some software instructions at the local terminal and some at the remote computer (or computer network). Those skilled in the art will also realize that, by utilizing conventional techniques known to those skilled in the art, all or a portion of the software instructions may be carried out by a dedicated circuit, such as a DSP, programmable logic array, or the like.

Any range or device value given herein may be extended or alteredwithout losing the effect sought, as will be apparent to the skilledperson.

Although the subject matter has been described in language specific tostructural features and/or methodological acts, it is to be understoodthat the subject matter defined in the appended claims is notnecessarily limited to the specific features or acts described above.Rather, the specific features and acts described above are disclosed asexample forms of implementing the claims.

It will be understood that the benefits and advantages described abovemay relate to one example or may relate to several examples. Theexamples are not limited to those that solve any or all of the statedproblems or those that have any or all of the stated benefits andadvantages. It will further be understood that reference to ‘an’ itemrefers to one or more of those items.

The steps of the methods described herein may be carried out in anysuitable order, or simultaneously where appropriate. Additionally,individual blocks may be deleted from any of the methods withoutdeparting from the spirit and scope of the subject matter describedherein. Aspects of any of the examples described above may be combinedwith aspects of any of the other examples described to form furtherexamples without losing the effect sought.

The term ‘comprising’ is used herein to mean including the method blocks or elements identified, but that such blocks or elements do not comprise an exclusive list and a method or apparatus may contain additional blocks or elements.

It will be understood that the above description is given by way of example only and that various modifications may be made by those skilled in the art. The above specification, examples and data provide a complete description of the structure and use of exemplary examples. Although various examples have been described above with a certain degree of particularity, or with reference to one or more individual examples, those skilled in the art could make numerous alterations to the disclosed examples without departing from the spirit or scope of this specification.

1. A computer-implemented stereo image processing method comprising, at a processor: performing a contour extraction process on first and second silhouette images, captured by a stereo camera, of at least part of an object in a scene, to produce respective first and second pluralities of contour points, each plurality of contour points defining a contour (K, K′) of the at least part of the object; calculating stereo correspondences between contour points on corresponding scanlines (S) in the first and second pluralities of contour points on the basis of contour point comparison metrics, the contour point comparison metrics comprising at least a centroid separation (distC), comprising a distance along a scanline (S) between each contour point (P, Q) and a centroid of a contour (K, K′) on which the contour point lies.
2. The method of claim 1, comprising receiving a stream of first and second input images at a frame rate, and calculating the stereo correspondences at least at the frame rate so that a 3D contour is output in real time.
3. The method of claim 1, wherein performing the contour extraction process comprises computing a convex hull of the at least part of the object to extract the pluralities of contour points defining the contour (K, K′).
4. The method of claim 1, comprising storing at least one of: each set of contour points in an ordered list during computation; and contour points for each scanline (S) in lists (P, Q).
5. The method of claim 1, wherein calculating stereo correspondences further comprises, for each of a number of corresponding scanlines (S): identifying a first set of contour points (P) which lie in the scanline in the first plurality of contour points and a second set of contour points (Q) which lie in the scanline in the second plurality of contour points; obtaining one or more comparison metrics (n, distC) for each contour point; comparing the one or more comparison metrics of each of the first set of contour points with the one or more comparison metrics of each of the second set of contour points to produce a cost matrix; and computing a minimum-cost path through the cost matrix to calculate the stereo correspondences between the first set of contour points and the second set of contour points.
6. The method of claim 5, wherein comparing comparison metrics comprises calculating, for each pair of contour points in the cost matrix, a magnitude of the difference between the centroid separation for one of the pair of contour points and the centroid separation for the other of the pair of contour points.
7. The method of claim 1, comprising imposing one or more constraints on the calculating of stereo correspondences, the one or more constraints comprising at least a box constraint requiring that a dimension of a first bounding box of a first contour (K) defined by the first plurality of contour points must not exceed by more than a predetermined amount a corresponding dimension of a second bounding box of a second contour (K′) defined by the second plurality of contour points.
8. The method of claim 1, comprising computing depth using the calculated correspondences between the first plurality of contour points (P) and the second plurality of contour points (Q) to obtain a depth map.
9. The method of claim 8, further comprising one or more of: invalidating contour points (P, Q) which lie in separate, adjacent scanlines (S) along a contour in the depth map and whose depth differs by more than a predetermined amount; smoothing depth values along a contour in the depth map; assigning depth values to occluded contour points, invalidated contour points, or both; and invalidating contour points (P, Q) having a normal vector (n) defining an angle with respect to the scanline (S) that is more than a predetermined amount.
10. A stereo image processing system comprising: a contour extraction module configured to extract contours from each of first and second silhouette images of at least part of an object in a scene, captured by a stereo camera, to produce respective first and second pluralities of contour points, each plurality of contour points defining a contour (K, K′) of the at least part of the object; a stereo computation module configured to calculate stereo correspondences between contour points on corresponding scanlines in the first and second pluralities of contour points on the basis of contour point comparison metrics, the contour point comparison metrics comprising at least a centroid separation (distC), comprising a distance along a scanline (S) between each contour point (P, Q) and a centroid of a contour (K, K′) on which the contour point lies.
11. The stereo image processing system of claim 10, wherein the stereo computation module is further configured to receive a stream of first and second input images at a frame rate, and calculate the stereo correspondences at least at the frame rate so that a 3D contour is output in real time.
12. The stereo image processing system of claim 10, wherein the contour extraction module is further configured to compute a convex hull of the at least part of the object to extract the pluralities of contour points defining the contour (K, K′).
13. The stereo image processing system of claim 10, comprising a storage module configured to store at least one of: each set of contour points in an ordered list during computation; and contour points for each scanline (S) in lists (P, Q).
14. The stereo image processing system of claim 10, wherein the stereo computation module is further configured to, for each of a number of corresponding scanlines (S): identify a first set of contour points (P) which lie in the scanline in the first plurality of contour points and a second set of contour points (Q) which lie in the scanline in the second plurality of contour points; obtain one or more comparison metrics (n, distC) for each contour point; compare the one or more comparison metrics of each of the first set of contour points with the one or more comparison metrics of each of the second set of contour points to produce a cost matrix; and compute a minimum-cost path through the cost matrix to calculate the stereo correspondences between the first set of contour points and the second set of contour points.
15. The stereo image processing system of claim 14, wherein the stereo computation module is further configured to compare comparison metrics at least partly by calculating, for each pair of contour points in the cost matrix, a magnitude of the difference between the centroid separation for one of the pair of contour points and the centroid separation for the other of the pair of contour points.
16. The stereo image processing system of claim 10, comprising a constraint module configured to impose one or more constraints on the calculating of stereo correspondences, the one or more constraints comprising at least a box constraint requiring that a dimension of a first bounding box of a first contour (K) defined by the first plurality of contour points must not exceed by more than a predetermined amount a corresponding dimension of a second bounding box of a second contour (K′) defined by the second plurality of contour points.
17. The stereo image processing system of claim 10, comprising a depth calculation module configured to compute depth using the calculated correspondences between the first plurality of contour points (P) and the second plurality of contour points (Q) to obtain a depth map.
18. The stereo image processing system of claim 17, wherein the depth calculation module is further configured to one or more of: invalidate contour points (P, Q) which lie in separate, adjacent scanlines (S) along a contour in the depth map and whose depth differs by more than a predetermined amount; smooth depth values along a contour in the depth map; assign depth values to occluded contour points, invalidated contour points, or both; and invalidate contour points (P, Q) having a normal vector (n) defining an angle with respect to the scanline (S) that is more than a predetermined amount.
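
By way of illustration only, and without limiting the claims, the per-scanline matching recited above (centroid separation, a cost matrix of absolute differences, and a minimum-cost path) may be sketched as follows. The function names, the fixed `occlusion_cost` charged for leaving a contour point unmatched, the order-preserving form of the path, and the rectified pinhole depth relation Z = f·b/d are all assumptions made for this sketch and are not taken from the specification. The normal-compatibility metric (n) is omitted for brevity.

```python
def centroid_separation(xs, centroid_x):
    """distC: signed distance along the scanline from each point to the contour centroid."""
    return [x - centroid_x for x in xs]


def build_cost_matrix(dist_p, dist_q):
    """cost[i][j] = |distC(P_i) - distC(Q_j)| for every pair of contour points."""
    return [[abs(dp - dq) for dq in dist_q] for dp in dist_p]


def min_cost_path(cost, n, m, occlusion_cost):
    """Order-preserving dynamic-programming alignment (assumed form of the path).

    Each step either matches P_i with Q_j or leaves one point unmatched
    at a fixed, assumed occlusion cost. Returns a list of (i, j) matches.
    """
    inf = float("inf")
    total = [[inf] * (m + 1) for _ in range(n + 1)]
    total[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0 and j > 0:
                total[i][j] = min(total[i][j], total[i - 1][j - 1] + cost[i - 1][j - 1])
            if i > 0:
                total[i][j] = min(total[i][j], total[i - 1][j] + occlusion_cost)
            if j > 0:
                total[i][j] = min(total[i][j], total[i][j - 1] + occlusion_cost)
    # Backtrack from the end of both lists to recover the matched pairs.
    matches = []
    i, j = n, m
    while i > 0 and j > 0:
        if total[i][j] == total[i - 1][j - 1] + cost[i - 1][j - 1]:
            matches.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif total[i][j] == total[i - 1][j] + occlusion_cost:
            i -= 1  # P_i left unmatched
        else:
            j -= 1  # Q_j left unmatched
    matches.reverse()
    return matches


def match_scanline(xs_left, cx_left, xs_right, cx_right, occlusion_cost=10.0):
    """Match contour points P (left) and Q (right) lying on one scanline."""
    dist_p = centroid_separation(xs_left, cx_left)
    dist_q = centroid_separation(xs_right, cx_right)
    cost = build_cost_matrix(dist_p, dist_q)
    return min_cost_path(cost, len(xs_left), len(xs_right), occlusion_cost)


def depths_from_matches(xs_left, xs_right, matches, focal_px, baseline_m):
    """Depth per match from disparity d = x_left - x_right (rectified pair assumed)."""
    depths = []
    for i, j in matches:
        d = xs_left[i] - xs_right[j]
        depths.append(focal_px * baseline_m / d if d > 0 else None)
    return depths


if __name__ == "__main__":
    left, right = [110, 140, 160], [100, 150]  # hypothetical x coordinates
    pairs = match_scanline(left, 135, right, 125)
    print(pairs)  # [(0, 0), (2, 1)]: the middle left point stays unmatched
    print(depths_from_matches(left, right, pairs, focal_px=400.0, baseline_m=0.25))  # [10.0, 10.0]
```

In this sketch the middle left-hand point has no counterpart on the right-hand scanline, so the path leaves it unmatched. Such a point would be a candidate for the later step of assigning depth values to occluded contour points.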